sampling error, undercoverage is definitely a mistake you should avoid making if you can (like spilling
coffee).
Framing Your Sample
In the previous example, the patient list is your sampling frame. A sampling frame
is the practical, working version of the population from which you literally draw your
sample. We described this list as a printout of patient names and their ages. Suppose that after the list
was printed, a few more patients joined the clinic, and a few patients stopped using the clinic because
they moved away. This situation means that your sampling frame — your list — is not a perfect
representation of the actual population from which you are drawing your sample.
If you omit population members from your sampling frame, you get undercoverage, which is a
form of non-sampling error (the type of error you want to avoid). Also, if you accidentally
include members in your sampling frame who are not part of the population (such as patients who
moved away from the clinic after you printed your list), and they actually get sampled, you have
another form of non-sampling error. Non-sampling error can also creep in from making sloppy
measurements during data collection, or making poor choices when designing your study. Chapter
8 provides guidance on how to minimize errors during data collection, and Chapters 5 and 7
provide advice on study design.
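To make the two kinds of frame mismatch concrete, here is a minimal Python sketch that compares a sampling frame with the real population using set differences. The patient IDs are hypothetical and invented purely for illustration; they do not come from the clinic example.

```python
# Hypothetical patient IDs, made up for illustration only.
population = {"P01", "P02", "P03", "P04", "P05", "P06"}  # patients using the clinic today
sampling_frame = {"P01", "P02", "P03", "P04", "P07"}     # the printed, now outdated, list

# Population members who never made it onto the list: the source of undercoverage.
missing_from_frame = population - sampling_frame

# People on the list who are no longer in the population (for example, they moved away).
not_in_population = sampling_frame - population

print("Missing from frame:", sorted(missing_from_frame))
print("Listed but not in population:", sorted(not_in_population))
```

In practice you rarely have a perfect record of the population to compare against, so treat this as a way to picture the problem rather than a procedure for detecting it.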
Another sampling-related vocabulary word is simulation. When talking about sampling, a simulation
refers to pretending to have data from an entire population from which you can take samples, and then
taking different samples to see what happens when you analyze the data. That way, you can calculate
sample statistics while peeking at the true population parameters behind the scenes to see
how the statistics and parameters behave together.
One simulation you could do in Microsoft Excel to illustrate sampling error is to create a column of
100 values that represent the ages of imaginary patients at a clinic and treat them as an entire population.
If you calculated the mean of these 100 values, you would be doing a simulation of the population
parameter.
If you randomly sampled 20 of these values and calculated the mean, you would be doing a
simulation of a sample statistic.
If you compared your parameter to the statistic to see how close they were to each other, you
would be doing a simulation of sampling error.
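The same three steps can be sketched outside of Excel. The following Python version is one possible way to do it, not the only one. The age range of 18 to 90, the fixed random seed, and all variable names are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(42)  # arbitrary seed so the run is repeatable

# Simulated population: 100 imaginary patient ages (the 18-90 range is an assumption).
population_ages = [random.randint(18, 90) for _ in range(100)]

# Simulated population parameter: the mean of all 100 values.
population_mean = statistics.mean(population_ages)

# Simulated sample statistic: the mean of 20 values drawn at random without replacement.
sample_ages = random.sample(population_ages, 20)
sample_mean = statistics.mean(sample_ages)

# Simulated sampling error: how far the statistic lands from the parameter.
sampling_error = sample_mean - population_mean

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
print(f"Sampling error:              {sampling_error:.2f}")
```

If you remove the seed and rerun the script, each run draws a different sample of 20, so the sample mean (and therefore the sampling error) changes from run to run while the population mean stays tied to whichever 100 values were generated.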
So far we’ve reviewed several concepts related to the act of sampling. However, we haven’t yet
examined different sampling strategies. It matters how you go about taking a sample from a population;
some approaches provide a sample that is more representative of the population than other
approaches. In the next section, we consider and compare several different sampling strategies.
Sampling for Success